Prediction of adolescent idiopathic scoliosis with machine learning algorithms using brain volumetric measurements

Abstract
Background: It is known that neuroanatomical and neurofunctional changes observed in the brain, brainstem and cerebellum play a role in the etiology of adolescent idiopathic scoliosis (AIS). This study aimed to investigate whether volumetric measurements of brain regions can be used as predictive indicators for AIS through machine learning techniques.
Methods: Patients with AIS and a severe degree of curvature (n = 32) and healthy individuals (n = 31) were enrolled in the study. Volumetric data from 169 brain regions, acquired from magnetic resonance imaging (MRI) of these individuals, were used as predictive factors. A comprehensive analysis was conducted using the twelve most prevalent machine learning algorithms, encompassing thorough parameter adjustments and cross-validation processes. Furthermore, the findings related to variable significance are presented.
Results: Among all the algorithms evaluated, the random forest algorithm produced the most favorable results in terms of various classification metrics, including accuracy (0.9083), AUC (0.993), F1-score (0.9170), and Brier score (0.1256). Additionally, the most critical variables were identified as the volumetric measurements of the right corticospinal tract, right corpus callosum body, right corpus callosum splenium, right cerebellum, and right pons, respectively.
Conclusion: The outcomes of this study indicate that volumetric measurements of specific brain regions can serve as reliable indicators of AIS. The developed model and the significant variables discovered hold promise for predicting scoliosis development, particularly in high-risk individuals.

diagnosis and treatment of AIS with cost-effective methods, such as exercise, alleviate the substantial financial burden on both individuals and society. 4 Recent studies over the past two decades have increasingly explored the role of the nervous system in the development of AIS. 2,5 It is suggested that the connection between brain anomalies and spinal deformities is primarily attributed to neuroanatomical and neurofunctional alterations observed in the brain, brainstem, and cerebellum. 2 Research has provided evidence of variations in both the cortex and white matter structure of the brain and cerebellum in individuals with AIS compared to their healthy counterparts. 6,7 Additionally, it has been noted that cortico-cortical inhibition (the suppression by the cerebral cortex of some of the unwanted impulses it produces) is significantly reduced on the concave side of the curvature in individuals with AIS. 8 These collective findings underscore the association between the central nervous system and AIS. 5 Electroencephalography, MRI, and transcutaneous electrical stimulation methods have all been used to identify variations in the central nervous system in individuals with AIS. 9 MRI is a non-invasive imaging technology that creates detailed 3D anatomical images through the use of magnets and radio frequency waves. 10 MRI serves to assess the human brain from both structural and functional perspectives, using techniques such as volumetric analysis, shape analysis, voxel-based morphometry calculations, cortical thickness measurement, tissue analysis, diffusion tensor imaging, and functional MRI. 11

Machine learning (ML)-based modeling is a relatively novel analytical method that has gained prominence, primarily in the development of predictive models within the domain of medical research. 10,12 ML is predominantly leveraged in medical research for disease classification, clinical decision-making, and the formulation of novel treatment strategies. 13,14 Despite the increasing momentum in medical research using ML, the existing body of research on AIS is still notably lacking. 15-17 Wang et al. 15 developed a deep learning model for predicting the progression of AIS during the initial clinic visit. Furthermore, another study used an ML approach to classify scoliosis patients based on their trunk surface asymmetry patterns. 16 Additionally, a random forest (RF) model was formulated to identify critical prognostic features for curve progression and predict the final major Cobb angle. 17 Furthermore, previous research has focused on assessing cortical thickness and brain white matter in individuals with AIS using brain MRI. 6,7 However, as of now, there has been a notable absence of ML-based models for predicting AIS based on brain MRI volume measurements.

Comprehensive investigation of brain volume measurements using ML algorithms in individuals with AIS may allow a clearer understanding of the role of the brain in the etiology of AIS. It may also contribute to understanding the link between brain region volumetric changes and the development of AIS in individuals. In this way, it can contribute significantly to the treatment management of individuals who may develop AIS. This entails the use of volume data derived from 169 distinct brain regions acquired from MRI images of individuals with a mean age of 15 ± 2.5 years in the healthy group and 15.6 ± 2.1 years in the patient group. The demographic characteristics are presented in detail in Table 1.
The study aimed to predict AIS using ML algorithms, as opposed to traditional statistical approaches. For this purpose, volume data obtained from 169 different brain regions of individuals with AIS and healthy individuals were processed with ML algorithms. The aim was to predict the onset of AIS by comparing the data of individuals with AIS and healthy individuals.

| Study design
The cross-sectional study was carried out on girls with severe curvature who were recommended scoliosis surgery by an orthopedist and healthy girls with similar demographic characteristics. Before starting the study, ethical approval was obtained from Hitit University Ethics Committee (2022-23, date: 04.11.2022). Since all participants were under the age of 18, written and verbal consent was obtained from the parents of the participants for their participation in the study. The study was conducted in accordance with the Helsinki Declaration.

| Data processing
The parcellation step was performed in the MriStudio and Mri-Cloud software using MRI data in DICOM format. The segmentation process was started by converting the MRI data from DICOM format to ".dpf" format. Next, "DTI processing" was selected under the diffusion tensor imaging ("DTI") tab on the http://www.braingps.mricloud.org website. The files previously converted to ".dpf" format were compressed into ".zip" format, uploaded to the relevant area and sent to the server. The file processed by the server was downloaded in ".zip" format with an ID number. The files named "DtiSeg_tensor_fa.hdr" or "DtiSeg_tensor_fa.img" in the ".zip" file were opened in the ROIEditor program.
Following these operations performed in ROIEditor, a parcellation map was opened on the images to be parcellated, and measurements and calculations were made for 168 regions of the brain.
The MriStudio program (https://MriStudio.org) consists of three separate software components: DtiStudio for opening and saving images, ROIEditor for creating masks from images, and DiffeoMap for linear and nonlinear image transformation.
In the MriStudio program, it is important to correctly enter the participant's age and gender and the brand of the MRI device on which the MRI was performed.

| Experimental settings
Two stages were followed in the data analysis process of the study.
The first stage covers statistical comparisons of the volume measurements of left and right cerebral hemispheres between the groups. The main objectives of this stage are to gain an idea about the dispersion of the measurements, to observe possible group differences from a statistical perspective, and to derive baseline insights for machine learning models, which is the main focus of this study. Therefore, following the first stage, comprehensive performance comparisons of ML algorithms have been conducted in the second stage.
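The resampling scheme reported for the second stage (a two-thirds/one-third split of the 63 participants, five times repeated 5-fold cross-validation on the training part, and standardization) can be sketched as follows. This is a minimal illustration in plain Python, not the study's code: the analysis itself was run in R with tidymodels, the function names and seeds are hypothetical, and whether the original split was stratified by group is not stated, so no stratification is assumed here.

```python
import random
import statistics


def train_test_split(n, train_fraction=2 / 3, seed=0):
    """Shuffle indices 0..n-1 and split them into training and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]


def repeated_kfold(indices, k=5, repeats=5, seed=0):
    """Yield (analysis, assessment) index lists for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Deal the shuffled indices into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        for held_out in folds:
            held = set(held_out)
            analysis = [i for i in shuffled if i not in held]
            yield analysis, held_out


def standardize(train_col, other_col):
    """Center and scale both columns with the mean and SD of the training column only."""
    mu = statistics.mean(train_col)
    sd = statistics.stdev(train_col)
    return [(x - mu) / sd for x in train_col], [(x - mu) / sd for x in other_col]


# 63 participants: two thirds for training, the rest for testing.
train_idx, test_idx = train_test_split(63)
# Five repeats of 5-fold cross-validation on the training indices.
resamples = list(repeated_kfold(train_idx))
```

Estimating the centering and scaling parameters on the training portion only, as in `standardize`, keeps information from the held-out data out of the preprocessing step; this is a common design choice and is assumed here rather than taken from the paper.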

| Statistical analysis
The statistical tests were conducted on IBM SPSS 25

| Machine learning algorithms
In the second stage, eleven different ML algorithms (logistic regression, 18 naive Bayes, k-nearest neighbors, 19 support vector machines, 20 RF, 21 linear discriminant analysis, 22 multilayer perceptron, 23 C5.0, 17 bagging, 21 extreme gradient boosting 24 and MARS 25)
and the false positive rate across various threshold settings. The results of the analysis were obtained using R software 26 as well as the tidyverse 27 and tidymodels 28 packages.
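The trade-off between the true positive rate and the false positive rate across threshold settings, and the AUC derived from it, can be illustrated with a short sketch. The labels and scores below are made-up toy values, the function names are hypothetical, and the code is a plain-Python stand-in rather than the R routines used in the study.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold over every distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        # A case is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: 1 = patient, 0 = healthy; scores are predicted probabilities.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
```

In this toy case eight of the nine patient/healthy pairs are ranked correctly, so the area equals 8/9; an AUC of 1 would mean every patient receives a higher score than every healthy individual.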

| RESULTS
Demographic characteristics such as age and body mass index were similar in both groups (p > 0.05) (Table 1).
The statistical analysis between the healthy and patient groups revealed that the cuneus, cerebellum, corticospinal tract, inferior fronto-occipital fasciculus, external capsule, body of corpus callosum, splenium of corpus callosum, retrolenticular part of internal capsule and pons measurements of the right cerebral hemisphere differed significantly (p < 0.05) (Appendix A, Table S4). Likewise, the parahippocampal gyrus, splenium of corpus callosum, nucleus accumbens and tapetum measurements of the left cerebral hemisphere differed significantly (p < 0.05) (Appendix A, Table S5).
Performance comparisons of ML algorithms for both training and test data based on right cerebral hemisphere measurements are given in Tables 2 and 3
S5). The results of the ML algorithms were analyzed and found to be very weak and practically non-functional.
The values achieved on the training dataset (Appendix B, Table S5) were relatively reasonable, but did not yield generalizable results for all of the criteria considered on the test dataset (Appendix C, Table S6). The results show that the highest accuracy and F1-score were obtained with logistic regression (accuracy: 0.6300, F1-score: 0.7197), the highest AUC with linear discriminant analysis (AUC: 0.7556) and the lowest Brier score with RF (Brier: 0.2378).
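For reference, the threshold-based and probability-based criteria reported here (accuracy, F1-score and Brier score) can be computed as in the sketch below. The labels and probabilities are toy values, the 0.5 cut-off is an assumption, and the function names are hypothetical; the study's own figures came from R.

```python
def accuracy(y_true, y_pred):
    """Share of cases whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Toy data: 1 = patient, 0 = healthy.
y_true = [1, 1, 0, 0]
probs = [0.9, 0.4, 0.2, 0.6]
# Assumed cut-off of 0.5 to turn probabilities into class predictions.
y_pred = [1 if p >= 0.5 else 0 for p in probs]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
brier = brier_score(y_true, probs)
```

A model that always outputs a probability of 0.5 has a Brier score of 0.25, so a value close to 0.25, such as the 0.2378 reported for the left hemisphere, indicates that the predicted probabilities carry little information.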
The results of left cerebral hemisphere measurements are therefore omitted from the main focus of the study due to the low performance values and the fact that no single algorithm appears to be dominant (Table S7).
for the purpose of assessing curve progression and predicting disease exacerbation in individual patients. 15,17,24 In the current study, unlike previous studies, we first used volumetric measurements of brain areas to test whether ML algorithms can distinguish individuals with scoliosis from healthy individuals and to test the performance of volumetric measurements of brain regions in predicting scoliosis.
Twelve distinct ML models, including logistic regression, naive Bayes, RF, and XGBoost, were utilized. Among these, more favorable results in scoliosis prediction were achieved through the application of the RF model (Tables 2 and 3).
While the precise etiopathogenesis of AIS remains elusive, the disparities observed within the central nervous system underline the undeniable role of the brain in the development of scoliosis. In the study, it was observed that individuals with Lenke Type I AIS had differences in the volumes of nine regions in the right hemisphere (Table S4) and four regions in the left hemisphere of the brain (Table S5) compared to healthy individuals. These differences in brain regions of individuals with AIS underline the undeniable role of the brain in the etiopathogenesis of AIS.
Significantly, the most pronounced variations, relative to healthy individuals, were observed within the right hemisphere, particularly in the regions governing the concave side of the curvature. Notably, substantial distinctions emerged in the neural pathways responsible for intra-hemispheric and inter-hemispheric connectivity, particularly within the areas associated with vision and motor control. It has been postulated that issues related to postural balance contribute to the etiopathogenesis and progression of scoliosis. 29 In the present study, distinct variations were observed in several brain regions associated with vision, further underlining the potential link between AIS and visual perception. Notably, the cuneus (Brodmann area 17) in the occipital lobe, a region crucial for primary visual processing, showed differences. Similarly, the fusiform gyrus, responsible for advanced visual information processing, particularly in face recognition, exhibited variances. Additionally, differences were detected in the retrolenticular part of the internal capsule, connected to the optic radiation and vision, as well as in the volume of the right optic tract, the pathway responsible for carrying sensory information. The inferior occipital gyrus, associated with visual processing, also displayed distinctions when compared to healthy individuals (Tables S4 and S5).
Achieving postural balance necessitates the effective transmission of sensory inputs from the visual, vestibular, and somatosensory systems to the brain. 30 The sense of vision serves as an essential information source, conveying stimuli from both within the body and the external environment. 31 In the case of visually impaired individuals, disruptions in normal head position and shoulder symmetry have been reported, often leading to conditions such as genu valgum in the knees and the development of spinal deformities like scoliosis, thoracic kyphosis, or lumbar lordosis. 32 Additionally, Catanzariti et al. 33 reported a fivefold increase in scoliosis cases among visually impaired individuals compared to the control group. Batin et al. 34 documented a higher degree of visual field deviation in individuals with AIS than in the control group. Their study data also revealed asymmetry in right and left visual field tests among individuals with AIS. 34 The study findings suggest that brain regions linked to visual processing could be a key source of vision-related issues in individuals with AIS. The observed distinctions in brain regions responsible for the transmission and interpretation of visual sensations significantly strengthen the association between AIS and visual perception. Furthermore, the study highlights that the right hemisphere of individuals with AIS is more affected in terms of visual fields, which may account for the asymmetry observed in right and left visual field tests among these individuals.
In this study, more disparities were identified in the right brain hemisphere, which is responsible for the concave side of the curvature in individuals with AIS, in comparison to healthy individuals. Of particular note was the reduction in the volumes of the gyrus precentralis and tractus corticospinalis, both of which play critical roles in motor control (Tables S4 and S5).
The corticospinal tract, often referred to as the pyramidal tract, serves as the principal neural pathway facilitating voluntary motor function. 35 Studies have reported that individuals with scoliosis exhibit asymmetry between the right and left corticospinal tracts.
Additionally, it has been noted that both corticospinal tracts in individuals with scoliosis are weaker compared to their healthy counterparts. 5,6 In their study, Payas et al. 5
In this study, it was observed that individuals with AIS exhibited lower volumes in the right brain hemisphere, particularly in the gyrus precentralis and tractus corticospinalis, which are responsible for motor control on the concave side of the curvature. Conversely, the volumes of the right red nucleus and right caudate nucleus, which indirectly contribute to motor control, were found to be larger (Table S4).
In response to issues arising in any particular region, the brain often uses a compensation mechanism, adapting brain regions with similar functions to take over from the affected area. 37 The precentral cortex, located in the frontal lobe, is responsible for the voluntary control of motor movements and serves as the origin for numerous motor pathways, including the corticospinal tract, corticobulbar tract, and cortico-rubrospinal tract. 38 The caudate nucleus is a paired, "C"-shaped subcortical structure located deep in the brain, near the thalamus. When combined with the putamen, the pair is referred to as the striatum, and they typically function collectively. The striatum is the primary source of input for the basal ganglia, which also encompasses the globus pallidus, subthalamic nucleus, and substantia nigra. Together, these deep brain structures predominantly regulate voluntary skeletal movement. Input to the caudate nucleus typically originates from the cortex, most frequently from the ipsilateral frontal lobe. Efferent projections from the caudate nucleus extend to the hippocampus, globus pallidus, and thalamus. 39 Regarding the red nucleus, it primarily controls movement in vertebrates lacking a significant corticospinal tract. However, in primates, where the corticospinal system is dominant, it is considered vestigial. 40 The human red nucleus plays a crucial role in various aspects of motor control. 41 In this study, the increased volume in the right red nucleus and the right caudate nucleus may indicate that the brain activates a compensation mechanism in response to the decrease in the volume of the gyrus precentralis. Consequently, it is reasonable to assume that motor dysfunction arising from the gyrus precentralis is compensated by the right red nucleus and the right caudate nucleus.
In the current study, it was observed that individuals with AIS exhibited lower volumes in several brain regions responsible for interhemispheric coordination, including the splenium of the corpus callosum, body of the corpus callosum, inferior fronto-occipital fasciculus, and superior fronto-occipital fasciculus (the latter could be a part of the anterior internal capsule) (Tables S4 and S5).
White matter primarily consists of axons enveloped by a lipid-rich myelin sheath, which restricts diffusion and plays a crucial role in providing fast and efficient connections between the cortex and subcortical regions. 8 Payas et al., 5 in their study, reported that individuals with scoliosis experienced notable issues, particularly in the white matter of the brain. There have been observations that the volume and the number of fibers in the corpus callosum, responsible for interhemispheric communication, are significantly reduced in individuals with AIS in comparison to their healthy counterparts. 6 In a study by Joly et al., which involved individuals with AIS exhibiting right thoracic and right thoracic-left lumbar curvature, it was reported that the anatomical location of the corpus callosum was lower than in healthy individuals. 42 These findings support the notion that abnormalities in the brain's white matter play a role in the etiopathogenesis of AIS, leading to interhemispheric coordination deficits.
The study findings suggest that individuals with scoliosis may experience challenges in healthy communication between the right and left brain hemispheres, as well as within the same hemisphere.
The observed issues in these brain structures might lead to compromised coordination between sensorimotor regions and hinder the ideal motor responses required for postural control.

| Study limitations
One limitation of this study is that it included only female individuals with AIS. Additionally, the sample size of this study for individuals with AIS was relatively small, and the degree of curvature among the participants was notably severe, with an average curvature of 52.5 degrees. As a result, the findings should not be extrapolated to patients with milder curvatures. To address these limitations, further research is essential, involving a more extensive patient cohort that encompasses varying curvature severities and includes both genders.
This will facilitate a more comprehensive understanding of AIS across a broader spectrum of cases.

| CONCLUSION
This study has illustrated the robust predictive capabilities of the RF model in scoliosis prediction, particularly when considering the volumetric attributes of brain regions. Furthermore, it was discerned that the most influential variables for prediction were derived from the volumetric measurements of specific brain regions, including the right corticospinal tract, right corpus callosum body, right corpus callosum splenium, right cerebellum, and right pons. Additionally, the study identified that individuals with AIS exhibited more pronounced disparities in the right brain hemisphere associated with the concave side of the curvature. These findings suggest that, despite the primary impact on brain regions related to the sense of vision in individuals with scoliosis, motor functions and the white matter structures involved in intra-hemispheric and inter-hemispheric communication are also notably affected. It is important to note that delayed detection of scoliosis can lead to enduring damage to anatomical structures such as muscles, bones, joint capsules, and ligaments surrounding the spine. Given that AIS typically advances without causing pain and progresses rapidly, early diagnosis can be challenging. In light of the results presented in this study, the identification of regional disparities through brain MRI using ML algorithms holds promise for aiding in the diagnosis of AIS. This method may offer the potential to predict the development of scoliosis, particularly in individuals at higher risk.

A total of 63 individuals were included in the study: 32 individuals with AIS, diagnosed at the Orthopedics and Traumatology Polyclinic of Kayseri City Training and Research Hospital, and 31 healthy individuals. Since AIS is more common in females, only female participants were included in the study to minimize gender and hormonal differences. The inclusion criteria for individuals with AIS were as follows: being right-hand dominant, having Lenke Type I scoliosis (the apex of the major curvature is on the right side), having a major curvature angle between 30 and 70 degrees, and having scoliosis diagnosed as AIS. The exclusion criteria were as follows: using the left hand as the dominant hand; presence of scoliosis other than AIS; and the presence of neurological, psychiatric, muscular, rheumatic or orthopedic diseases.

TABLE 1 Demographic characteristics of the participants.

2.3 | Data acquisition
MRI of the participants was performed with the German brand 3 T (Tesla) Siemens Magnetom Skyra device. The sequence settings for MRI and diffusion tensor imaging (DTI, which is used to obtain information about the microarchitecture of brain white matter tissue) were as follows: TR = 4900 ms, TE = 95 ms, Number-of-Slice = 36, Flip Angle = 90°, FOV = 230 × 230 mm², matrix = 128 × 128 and slice thickness = 3.0 mm (voxel size 1.8 × 1.8 × 3.5 mm). The resulting images were converted to DICOM format and saved.
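The in-plane voxel size quoted with the sequence settings follows from dividing the field of view by the acquisition matrix. The short check below is only an illustration of that arithmetic; the function name is hypothetical.

```python
def in_plane_voxel_mm(fov_mm, matrix):
    """In-plane voxel edge length: field of view divided by the number of matrix elements."""
    return fov_mm / matrix


# FOV = 230 mm and matrix = 128, as listed in the sequence settings.
edge = in_plane_voxel_mm(230, 128)
edge_rounded = round(edge, 1)
```

With these settings the edge length is 1.796875 mm, which rounds to the reported 1.8 mm in each in-plane direction.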
were evaluated. The data set was randomly shuffled and split into two thirds training data and one third test data. The models were trained on the training data using five times repeated 5-fold cross-validation and tested on the independent test data. The data were preprocessed using standardization (i.e., centering and scaling) to eliminate the effect of unit variations. A grid search was used for the tuning parameters corresponding to each ML model, and thirty possible combinations were tested. Model performance was measured by Accuracy, AUC (area under the ROC curve), F1 and Brier scores with confidence intervals. Finally, the variable importance plots and ROC curve are presented for the best model. In determining the best model, the ROC curve is a useful and commonly used criterion for assessing the performance of a classifier by illustrating the trade-off between the true positive rate

are corticospinal tract, body of corpus callosum, splenium of corpus callosum, cerebellum and pons measurements, respectively. It is notable that the variables considered important by the RF algorithm are also statistically significant variables. In contrast to the statistical significance tests, measurements such as the right fusiform gyrus, posterior limb of internal capsule and right optic tract were found to be important in the ML model. Statistical analysis and ML results based on left cerebral hemisphere measurements were obtained separately to provide additional insights. It is observed that parahippocampal gyrus, splenium of corpus callosum, tapetum, and nucleus accumbens measurements are statistically significant in the left cerebral hemisphere measurements. In general, there are relatively few significant results in the left cerebral hemisphere compared with the right hemisphere (Appendix B, Table

FIGURE 1 The ROC curve for the RF model.
FIGURE 2 Relative importance scores of the twenty most important variables based on the RF model of the right cerebral hemisphere.

4 | DISCUSSION
In the current study, ML-based modeling approaches were used to predict scoliosis based on volumetric brain features. The results indicated that the RF model showed the best performance. Moreover, the most important variables were found to be the right corticospinal tract, right body of corpus callosum, right splenium of corpus callosum, right cerebellum, and right pons volumetric measurements, respectively (Figure 2). ML algorithms have emerged as a promising tool for improving the prediction and early diagnosis of scoliosis. Several studies have explored the development and implementation of algorithms based on deep learning and ML in the context of scoliosis. These investigations have predominantly used data derived from spine radiographs
associated the reduced number of fibers in the corticospinal tracts of individuals with scoliosis with decreased muscle strength. Furthermore, Kocaman et al. 36 reported that muscle fibers on the convex side of the major curve in individuals with scoliosis are larger compared to the concave side, but the paraspinal muscle fibers on both sides of individuals with AIS are less developed compared to healthy individuals. The decrease in gyrus precentralis and tractus corticospinalis volumes detected in this study may be closely related to the weaker muscles on the concave side of the curvature in individuals with scoliosis.
and MedCalc 21 software. In the comparison of cerebral hemisphere measurements between the healthy and patient groups, the independent samples t test, Welch t test or Mann-Whitney U test was performed depending on whether the assumptions of normality and homogeneity of variance were met, and the result of the appropriate test was reported. The distribution of the data (i.e., normality), which is one of the parametric test assumptions, was tested by the Kolmogorov-Smirnov test, and homogeneity of variance was tested by the Levene test. The statistical significance level was set at 0.05.
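The choice among the three tests can be written as a small decision rule. The mapping below follows the conventional reading of the text (Mann-Whitney U when normality fails, Welch when only homogeneity of variance fails) and is an assumption, since the paper does not spell out the exact rule; the function names are hypothetical, and the Welch statistic is included only to show what that branch computes.

```python
import math
import statistics


def choose_two_group_test(normal_a, normal_b, equal_variances):
    """Pick a two-group comparison test from the outcome of the assumption checks."""
    if not (normal_a and normal_b):
        return "Mann-Whitney U test"
    if equal_variances:
        return "independent samples t test"
    return "Welch t test"


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom for two samples."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy volumes for two groups (made-up numbers, not study data).
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
selected = choose_two_group_test(True, True, False)
t_stat, dof = welch_t(group_a, group_b)
```

The p-value would then come from a t distribution with `dof` degrees of freedom, which a statistics package such as the ones named in the text provides.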
respectively. From the criteria evaluated in the study, it is seen that the RF algorithm is the best algorithm in both training and test data, where high Accuracy, AUC and F1-score values and a low Brier score indicate a good model. The training performance of the RF algorithm is quite convincing, with an accuracy of 0.9083 (CI: 0.8679, 0.9488), AUC of 0.993 (CI: 0.9870, 0.9990), F1-score of 0.9170 (CI: 0.8794, 0.9546) and Brier score of 0.1256 (CI: 0.1151, 0.1360). Similarly, it provides considerably high results on the test data (Accuracy: 0.8500, AUC: 0.9550, F1-score: 0.9612 and Brier score: 0.1622). In addition, the ROC

TABLE 2 Performance metrics (mean and 95% confidence interval) of algorithms on repeated cross-validation of the training set for right cerebral hemisphere measurements.
Abbreviation: AUC, area under the receiver operating curve. "Bold" indicates the best performing model.

TABLE 3 Performance metrics (mean and 95% confidence interval) of algorithms based on the independent testing set for right cerebral hemisphere measurements.
Abbreviation: AUC, area under the receiver operating curve. "Bold" indicates the best performing model.